Causal Inference for Case-Control Studies

Authors

  • Sherri Rose
  • Nicholas Jewell
  • Ira Tager
  • Mark van der Laan
Abstract

Causal Inference for Case-Control Studies, by Sherri Rose. Doctor of Philosophy in Biostatistics, University of California, Berkeley. Professor Mark van der Laan, Chair.

Case-control study designs are frequently used in public health and medical research to assess potential risk factors for disease. These designs are particularly attractive to investigators researching rare diseases, because they can sample known cases of disease rather than following a large number of subjects and waiting for disease onset in a relatively small number of individuals. The data-generating experiment in case-control study designs involves an additional complexity called biased sampling. That is, one assumes an underlying experiment that randomly samples a unit from a target population, measures baseline characteristics, assigns an exposure, and measures a final binary outcome, but one actually samples from the conditional probability distribution given the value of the binary outcome. One still desires to assess the causal effect of exposure on the binary outcome for the target population. The targeted maximum likelihood estimator of a causal effect of treatment on the binary outcome based on such case-control studies is presented. Our proposed case-control-weighted targeted maximum likelihood estimator relies on knowledge of the true prevalence probability, or a reasonable estimate of this probability, to eliminate the bias of the case-control sampling design. We use the prevalence probability in case-control weights, and our case-control weighting scheme successfully maps the targeted maximum likelihood estimator for a random sample into a method for case-control sampling. Individually matched case-control study designs are commonly implemented in the field of public health. While matching is intended to eliminate confounding, the main potential benefit of matching in case-control studies is a gain in efficiency.
We investigate the use of the case-control-weighted targeted maximum likelihood estimator to estimate causal effects in matched case-control study designs. We also compare the case-control-weighted targeted maximum likelihood estimator in matched and unmatched designs in an effort to determine which design yields the most information about the causal effect. In many practical situations where a causal effect is the parameter of interest, researchers may be better served using an unmatched design. We also consider two-stage sampling designs, including so-called nested case-control studies, where one takes a random sample from a target population and completes measurements on each subject in the first stage. The second stage involves drawing a subsample from the original sample and collecting additional data on the subsample. This data structure can be viewed as a missing data structure on the full-data structure collected in the second stage of the study. We propose an inverse-probability-of-censoring-weighted targeted maximum likelihood estimator in two-stage sampling designs. Two-stage designs are also common for prediction research questions. We present an analysis using super learner in nested case-control data from a large Kaiser Permanente database to generate a function for mortality risk prediction.
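The abstract does not spell out the case-control weights themselves. As a minimal sketch, assuming the commonly cited form in which each case receives weight q0 (the prevalence probability) and each control receives weight (1 - q0) / J, with J the number of controls per case, the weighting step might look like the following. The function names `case_control_weights` and `weighted_mean` and the toy data are hypothetical, and the sketch covers only the weighting idea, not the targeted maximum likelihood estimator.

```python
from typing import List, Sequence


def case_control_weights(outcomes: Sequence[int], prevalence: float) -> List[float]:
    """Return one weight per subject (assumed form: q0 for cases, (1 - q0) / J for controls)."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    n_cases = sum(1 for y in outcomes if y == 1)
    n_controls = len(outcomes) - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("sample must contain at least one case and one control")
    # J: average number of controls sampled per case.
    controls_per_case = n_controls / n_cases
    return [
        prevalence if y == 1 else (1.0 - prevalence) / controls_per_case
        for y in outcomes
    ]


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average; with the weights above it equals
    q0 * (mean among cases) + (1 - q0) * (mean among controls)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Toy case-control sample (made-up numbers): 2 cases, 4 controls.
outcomes = [1, 1, 0, 0, 0, 0]
exposure = [1, 1, 1, 0, 0, 0]
q0 = 0.1  # assumed known prevalence in the target population

weights = case_control_weights(outcomes, q0)
naive = sum(exposure) / len(exposure)      # ignores the biased sampling
adjusted = weighted_mean(exposure, weights)
print(round(naive, 3), round(adjusted, 3))
```

In this toy sample the unweighted exposure proportion is dominated by the oversampled cases, while the weighted version scales cases and controls back to their assumed shares of the target population.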

Similar resources

Symposium: Case Selection, Case Studies, and Causal Inference Introduction

For scholars concerned with causal inference, how should cases be selected in case study research? This symposium builds on previously published arguments by James Fearon and David Laitin (2008), who favor random sampling in case study analysis, and by John Gerring (2008), who favors purposive selection. The statistician David Freedman—long an advocate of case studies as an important research t...

Causal inference: the case of hygiene and health.

A fundamental goal of applied epidemiology is to determine whether a relationship between 2 factors is causal. For example, the primary purpose of an outbreak investigation is to identify what factor(s) “caused” the problem, and the purpose of the Study of the Efficacy of Nosocomial Infection Control (SENIC Project) was to measure the effect of infection control and prevention programs on rates...

Comment: Causal Inference in the Medical Area

It is an honor to be a discussant to the Morris Hansen Lecture, and a pleasure to be discussing Don Rubin’s talk. Dr. Rubin has clarified over the years many of the deep issues relating to causal inference. Let me start with a story. About 20 years ago when I was teaching at UCLA, I was eating breakfast one morning at my kitchen table, and my two-and-a-half-year-old daughter was in the next room...

Publication date: 2011